8 research outputs found

    Improved ESP-index: a practical self-index for highly repetitive texts

    Full text link
    While several self-indexes for highly repetitive texts exist, developing a practical self-index applicable to real world repetitive texts remains a challenge. ESP-index is a grammar-based self-index on the notion of edit-sensitive parsing (ESP), an efficient parsing algorithm that guarantees upper bounds of parsing discrepancies between different appearances of the same subtexts in a text. Although ESP-index performs efficient top-down searches of query texts, it has a serious issue on binary searches for finding appearances of variables for a query text, which resulted in slowing down the query searches. We present an improved ESP-index (ESP-index-I) by leveraging the idea behind succinct data structures for large alphabets. While ESP-index-I keeps the same types of efficiencies as ESP-index about the top-down searches, it avoid the binary searches using fast rank/select operations. We experimentally test ESP-index-I on the ability to search query texts and extract subtexts from real world repetitive texts on a large-scale, and we show that ESP-index-I performs better that other possible approaches.Comment: This is the full version of a proceeding accepted to the 11th International Symposium on Experimental Algorithms (SEA2014

    Rank and Select Operations on Binary Strings

    Full text link
    Rank and Select Operations on Binary String

    Dynamizing Succinct Tree Representations

    No full text
    Abstract. We consider succinct, or space-efficient, representations of ordinal trees. Representations exist that take 2n + o(n) bits to represent a static n-node ordinal tree – close to the information-theoretic minimum – and support navigational operations in O(1) time on a RAM model; and some implementations have good practical performance. The situation is different for dynamic ordinal trees. Although there is theoretical work on succinct dynamic ordinal trees, there is little work on the practical performance of these data structures. Motivated by applications to representing XML documents, in this paper, we report on a preliminary study on dynamic succinct data structures. Our implementation is based on representing the tree structure as a sequence of balanced parentheses, with navigation done using the min-max tree of Sadakane and Navarro (SODA ’10). Our implementation shows promising performance for update and navigation, and our findings highlight two issues that we believe will be important to future implementations: the difference between the finger model of (say) Farzan and Munro (ICALP ’09) and the parenthesis model of Sadakane and Navarro, and the choice of the balanced tree used to represent the min-max tree.

    Compressed Prefix Sums

    No full text
    corecore